class: center, middle, inverse, title-slide

# Lecture 3
## Summary Statistics
### Psych 10 C
### University of California, Irvine
### 03/30/2022

---

## Summary Statistics

- Another common way to summarize information from an experiment or a survey is by using a statistic.

--

- Statistics are **functions** of **random variables** in an experiment that can be used to convey information about the location of our observations and how they vary.

--

- Not all the variables in an experiment are equally important, so we don't usually look for ways to visualize every one of them, but we still want to make sure that we give all the information we can about our sample.

--

- In that case we can use summary statistics to report some of their properties.

---
class: inverse, center, middle

# Random Variables and Functions

---

## Random variables

- Statisticians are very bad at naming things ...

--

- So when you think of the words random and variable, what comes to mind first?

--

- The formal definition is the opposite!

--

- **Definition:** A random variable is a function of the outcomes of an experiment.

---

# Functions

- Functions have formal definitions; however, what we have to remember is how they "work".

--

- Intuitively, we can think of functions as rules about how to associate two groups of "things".

--

- For example, suppose we toss a coin and record whether it landed heads or tails. A function could be a rule that states:

  - `\(x = 0\)` if the outcome is tails and `\(x = 1\)` if the outcome is heads.

--

- This is a simple rule that just lets us assign numbers to a variable `\(x\)` depending on the result of a coin toss. In other words, `\(x\)` is a function of the outcome of the experiment.

---

## Functions

- Another simple function would be `\(y = x + 1\)`.

- This function tells us that whatever the value of `\(x\)` is, we can get the value of `\(y\)` by adding `\(1\)` to `\(x\)`.

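
--

- As a minimal sketch, the coin-toss rule and `\(y = x + 1\)` could be written in R as follows. The helper name `coin_to_x` and the four example tosses are made up for illustration.

```r
# Hypothetical helper (not part of the course code):
# maps a coin-toss outcome to a number, 1 for heads and 0 for tails.
coin_to_x <- function(outcome) {
  ifelse(outcome == "heads", 1, 0)
}

# Four made-up tosses, used only to show the rule at work.
outcomes <- c("heads", "tails", "tails", "heads")

x <- coin_to_x(outcomes)  # 1 0 0 1
y <- x + 1                # 2 1 1 2
```

- The same outcomes always produce the same values of `x` and `y`, which is what makes these rules functions.
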
--

- Regardless of how complex they look, we can think of functions as a "map" that specifies how to get from the values of one variable to the values of another.

---

## Back to Random Variables

- Random variables are neither random nor variables; they are simply the rules we use to assign numbers to the outcomes of an experiment.

--

- In other words, they are deterministic functions (see, statisticians are really bad at naming things).

--

- In our previous example with the coin toss, `\(x\)` can be considered a random variable, as it is a rule that allows us to assign a numeric value (0 or 1) to the outcome of the experiment (heads or tails).

---

## Example with the memory experiment

- Let's think of our memory experiment. For a given participant and test (say participant 1, test 1), each time we present a word the participant responds that the word either was or was not on the original list.

--

- Additionally, each word that we present either was on the original list or it wasn't.

--

- We can treat our participants' responses as probabilistic, given that we are not sure how any person would respond to any word.

--

- Now we can create a random variable that says:

--

  - if the word was on the original list **and** the participant responds that the word was on the original list, then `\(x = 1\)`.

--

  - if the word was on the original list **and** the participant responds that the word was **not** on the original list, then `\(x = 0\)`.

--

- If we record the value of `\(x\)` for all trials for a given participant, we would have 50 values that indicate, for each word presented, whether that particular response was correct or not.

---

## Statistics

- The examples we have talked about are all statistics: a statistic is just a function of our sample (data).

--

- In our memory experiment we don't have the record of the responses of each participant to each word; we have something "simpler".

--

- We have another random variable that adds all the correct responses.

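
--

- As a small illustration, assume a made-up vector of ten 0/1 values for one participant (the real experiment has 50 trials, and these numbers are not real data). The total number of correct responses is just the sum of the trial-level values:

```r
# Made-up trial-level values for one participant (illustrative only):
# 1 = correct response, 0 = incorrect response.
x <- c(1, 0, 1, 1, 0, 1, 1, 1, 0, 1)

# The statistic: total number of correct responses.
total_correct <- sum(x)  # 7
```
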
--

- We have lost some information while doing so. Can you guess what information has been lost?

--

- We traded the information about the order of the correct responses for a summary of the experiment: the total number of correct responses.

---

## Statistics

- Every time we use a statistic (a function of our experimental outcomes) we either:

--

  - keep the same information (for example, assigning a value of 1 to heads in the coin toss)

--

  - lose information (for example, when we take the number of correct responses in the memory experiment)

--

- In the majority of the examples in this course this will not be a problem; however, it is important to keep this loss of information in mind.

---
class: inverse, center, middle

# Commonly Used Statistics

## The Mean

---

## Mean

- As you know from Psych 10 B, one of the properties of a random variable (r.v.) that we are interested in is its expected value.

--

- This value can be calculated with the formula:

`$$\mathbb{E}(x) = \sum_x x \ p(x)$$`

--

- We are faced with a problem here: when we gather data from an experiment, we don't know the probability of each of the values of our random variable.

--

- In other words, we don't know `\(p(x)\)`.

--

- For example, what is the probability that a participant has 40 correct responses?

---

## Mean

- Fortunately, we can mathematically prove that the **average** of our observations of a random variable will tend to be close to its expected value, especially when we have many observations.

--

- This is true regardless of the values of `\(p(x)\)`.

--

- Of course this is just an approximation and will therefore be prone to error.

--

- But it will be our best guess!

--

- Calculating the average is simple:

`$$\bar{x} = \sum_{i = 1}^n \frac{x_i}{n}$$`

--

- Here we use `\(x_i\)` to indicate each of our observations (remember that in the memory experiment we have 50 values, each a 0 or a 1). The variable `\(n\)` represents the total number of observations.

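
--

- As a minimal sketch of the formula above, assume a made-up vector `x` of ten 0/1 responses (not data from the memory experiment). We can compute the average by hand in R and compare it with the built-in `mean()`:

```r
# Made-up observations (illustrative only): 1 = correct, 0 = incorrect.
x <- c(1, 0, 1, 1, 0, 1, 1, 1, 0, 1)

# n is the total number of observations.
n <- length(x)

# Average following the formula: add up each x_i divided by n.
x_bar <- sum(x / n)

# The built-in mean() gives the same value (up to rounding error).
same_as_builtin <- isTRUE(all.equal(x_bar, mean(x)))
```

- Here `x_bar` is the proportion of correct responses, because every observation is either 0 or 1.
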
---

# Example: mean age of the participants

- Let's go back to the memory example and look at the mean age of our participants.

--

```r
# Assumes the `memory` data frame from the course is already loaded.
library(dplyr)

# Average the age column, then extract the result as a single number.
mean_age <- memory %>%
  summarise(bar_age = mean(age)) %>%
  pull(bar_age)
```

--

- We can look at the mean age of our participants by typing the name of the variable in the console:

```r
mean_age
```

```
[1] 36.68
```

---

# Note:

- For your homework you will need to show average values in the text. An easy way to do this is to use the following code in the text:

The mean age of the participants in the experiment was

```
` r name-of-variable`
```

- which will be printed on the pdf as:

The mean age of the participants in the experiment was 36.68.